
    Parallel universes to improve the diagnosis of cardiac arrhythmias

    We are interested in using parallel universes to learn interpretable models that can subsequently be used to automatically diagnose cardiac arrhythmias. In our study, parallel universes are heterogeneous sources, such as electrocardiograms, blood pressure measurements, and phonocardiograms, that give relevant information about the cardiac state of a patient. To learn interpretable rules, we use an inductive logic programming (ILP) method on a symbolic version of our data. Aggregating the symbolic data coming from all the sources before learning increases both the number of possible relations that can be learned and the richness of the language. We propose a two-step strategy to deal with the resulting dimensionality problems when using ILP. First, rules are learned independently in each universe. Second, the learned rules are used to bias a new learning process on the aggregated data. The results show that this method is much more efficient than learning directly from the aggregated data. Furthermore, the good accuracy results confirm the benefits of using multiple sources when trying to improve the diagnosis of cardiac arrhythmias.
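
    As a minimal sketch (not the authors' implementation), the two-step strategy can be outlined as follows; `learn_rules`, `merge_examples` and `merge_background` are hypothetical placeholders for an ILP learner and the aggregation steps.

```python
# Illustrative sketch of the two-step multisource ILP strategy described above.
# All helper functions are hypothetical stand-ins, not the authors' system.

def learn_rules(examples, background, language_bias):
    """Placeholder for one ILP run (e.g. an Aleph-style learner) returning rules."""
    raise NotImplementedError

def merge_examples(universes):
    """Placeholder: aggregate symbolic examples from every source."""
    raise NotImplementedError

def merge_background(universes):
    """Placeholder: aggregate background knowledge from every source."""
    raise NotImplementedError

def two_step_multisource_learning(universes):
    # Step 1: learn rules independently in each universe (ECG, blood pressure,
    # phonocardiogram, ...), so each run searches a small hypothesis language.
    per_universe_rules = []
    for name, (examples, background) in universes.items():
        per_universe_rules += learn_rules(examples, background, language_bias=name)

    # Step 2: learn again from the aggregated data, but bias the search with the
    # rules found in step 1 instead of exploring the full multisource language.
    return learn_rules(merge_examples(universes),
                       merge_background(universes),
                       language_bias=per_universe_rules)
```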

    Constraint-based Subspace Clustering

    In high-dimensional data, the general performance of traditional clustering algorithms decreases. This is partly because the similarity criterion used by these algorithms becomes inadequate in high-dimensional space. Another reason is that some dimensions are likely to be irrelevant or to contain noisy data, thus hiding a possible clustering. To overcome these problems, subspace clustering techniques, which can automatically find clusters in relevant subsets of dimensions, have been developed. However, due to the huge number of subspaces to consider, these techniques often lack efficiency. In this paper we propose to extend the framework of bottom-up subspace clustering algorithms by integrating background knowledge and, in particular, instance-level constraints to speed up the enumeration of subspaces. We show how this new framework can be applied to both density- and distance-based bottom-up subspace clustering techniques. Our experiments on real datasets show that instance-level constraints can not only increase the efficiency of the clustering process but also the accuracy of the resulting clustering.
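
    The following rough sketch, under simplifying assumptions, shows how instance-level constraints can prune a bottom-up subspace enumeration; `cluster_in` stands in for any density- or distance-based clustering routine and is not part of the paper.

```python
# Bottom-up subspace enumeration pruned by must-link / cannot-link constraints.
# `cluster_in(data, dims)` is a hypothetical placeholder returning a list of
# clusters (each a set of point ids) found in the given subspace.
from itertools import combinations

def satisfies(clusters, must_link, cannot_link):
    """True if the clustering violates no instance-level constraint."""
    assign = {p: cid for cid, pts in enumerate(clusters) for p in pts}
    ml_ok = all(assign.get(a) == assign.get(b)
                for a, b in must_link if a in assign and b in assign)
    cl_ok = all(assign.get(a) != assign.get(b)
                for a, b in cannot_link if a in assign and b in assign)
    return ml_ok and cl_ok

def constrained_subspace_clustering(data, dims, cluster_in, must_link, cannot_link):
    results, frontier = {}, []
    # Start with 1-dimensional subspaces.
    for d in dims:
        clusters = cluster_in(data, (d,))
        if clusters and satisfies(clusters, must_link, cannot_link):
            results[(d,)] = clusters
            frontier.append((d,))
    # Grow subspaces level by level; the constraints discard branches early,
    # which is what speeds up the enumeration.
    while frontier:
        candidates = {tuple(sorted(set(a) | set(b)))
                      for a, b in combinations(frontier, 2)
                      if len(set(a) | set(b)) == len(a) + 1}
        frontier = []
        for s in candidates:
            clusters = cluster_in(data, s)
            if clusters and satisfies(clusters, must_link, cannot_link):
                results[s] = clusters
                frontier.append(s)
    return results
```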

    Mining Mid-level Features for Image Classification

    Mid-level or semi-local features learnt using class-level information are potentially more distinctive than traditional low-level local features constructed in a purely bottom-up fashion. At the same time, they preserve some of the robustness properties with respect to occlusions and image clutter. In this paper we propose a new and effective scheme for extracting mid-level features for image classification, based on relevant pattern mining. In particular, we mine relevant patterns of local compositions of densely sampled low-level features. We refer to the new set of obtained patterns as Frequent Local Histograms, or FLHs. During this process, we pay special attention to keeping all the local histogram information and to selecting the most relevant reduced set of FLH patterns for classification. The careful choice of the visual primitives and an extension to exploit both local and global spatial information allow us to build powerful bag-of-FLH-based image representations. We show that these bag-of-FLHs are more discriminative than traditional bag-of-words representations and yield state-of-the-art results on various image classification benchmarks, including Pascal VOC.
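
    A rough sketch of the idea, with an assumed square neighbourhood and helper names invented for illustration: build local histograms of visual-word assignments around densely sampled points, keep the frequent ones (FLHs), and describe each image by its bag of FLHs.

```python
# Toy version of mining Frequent Local Histograms (FLHs) and building a
# bag-of-FLHs image representation. Neighbourhood shape and function names
# are illustrative assumptions, not the paper's exact construction.
from collections import Counter

def local_histograms(word_map, radius=1):
    """word_map: 2-D grid of visual-word ids, one per densely sampled point."""
    h, w = len(word_map), len(word_map[0])
    hists = []
    for i in range(radius, h - radius):
        for j in range(radius, w - radius):
            neigh = [word_map[a][b]
                     for a in range(i - radius, i + radius + 1)
                     for b in range(j - radius, j + radius + 1)]
            # A local histogram is the multiset of visual words in the window.
            hists.append(tuple(sorted(Counter(neigh).items())))
    return hists

def mine_flhs(word_maps, min_support):
    """Keep local histograms occurring in at least `min_support` images."""
    support = Counter(h for wm in word_maps for h in set(local_histograms(wm)))
    return [h for h, c in support.items() if c >= min_support]

def bag_of_flhs(word_map, flh_patterns):
    occ = Counter(local_histograms(word_map))
    return [occ.get(p, 0) for p in flh_patterns]
```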

    On the benefits of self-taught learning for brain decoding

    We study the benefits of using a large public neuroimaging database composed of fMRI statistic maps, in a self-taught learning framework, for improving brain decoding on new tasks. First, we leverage the NeuroVault database to train, on a selection of relevant statistic maps, a convolutional autoencoder to reconstruct these maps. Then, we use this trained encoder to initialize a supervised convolutional neural network to classify tasks or cognitive processes of unseen statistic maps from large collections of the NeuroVault database. We show that such a self-taught learning process always improves the performance of the classifiers, but the magnitude of the benefits strongly depends on the amount of data available both for pre-training and fine-tuning the models, and on the complexity of the targeted downstream task.
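
    A hedged PyTorch sketch of this pipeline is given below: a small convolutional autoencoder is pre-trained to reconstruct statistic maps, and its encoder then initializes a supervised classifier; the layer sizes and architecture are illustrative assumptions, not those of the study.

```python
# Self-taught learning sketch: pre-train an autoencoder on unlabeled 3-D
# statistic maps, then reuse its encoder inside a supervised classifier.
# Architecture and sizes are illustrative, not the ones used in the paper.
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):          # x: (batch, 1, D, H, W) statistic maps
        return self.net(x)

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = Encoder()
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose3d(16, 1, 3, stride=2, padding=1, output_padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

class TaskClassifier(nn.Module):
    def __init__(self, pretrained_encoder, n_classes):
        super().__init__()
        self.encoder = pretrained_encoder   # weights transferred from the autoencoder
        self.head = nn.Sequential(nn.AdaptiveAvgPool3d(1), nn.Flatten(),
                                  nn.Linear(32, n_classes))

    def forward(self, x):
        return self.head(self.encoder(x))

# Pre-training would minimize a reconstruction loss (e.g. nn.MSELoss()) on
# NeuroVault maps; fine-tuning would minimize nn.CrossEntropyLoss() on the
# labelled downstream task, optionally freezing or slowly updating the encoder.
```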

    Learning rules from multisource data for cardiac monitoring

    This paper formalises the concept of learning symbolic rules from multisource data in a cardiac monitoring context. Our sources, electrocardiograms and arterial blood pressure measurements, describe cardiac behaviours from different viewpoints. To learn interpretable rules, we use an Inductive Logic Programming (ILP) method. We develop an original strategy to cope with the dimensionality issues caused by using this ILP technique on a rich multisource language. The results show that our method greatly improves the feasibility and efficiency of the process while remaining accurate. They also confirm the benefits of using multiple sources to improve the diagnosis of cardiac arrhythmias.
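
    As a purely illustrative example (both the events and the rule below are invented, not taken from the paper), symbolic events from the two sources might be aggregated and matched against a learned rule as follows.

```python
# Toy multisource symbolic data: (event type, time in seconds, qualitative label).
# Both the events and the rule are hypothetical illustrations.
ecg_events = [("qrs", 0.42, "normal"), ("qrs", 1.25, "abnormal")]
abp_events = [("systole", 0.47, "low"), ("systole", 1.30, "low")]

def aggregate(*sources):
    """Merge per-source symbolic events into one time-ordered stream."""
    return sorted((t, kind, label) for src in sources for kind, t, label in src)

def hypothetical_rule(stream):
    """Fires when an abnormal QRS is followed within 0.2 s by a low systole."""
    return any(
        kind == "qrs" and label == "abnormal"
        and any(k == "systole" and l == "low" and 0 <= t2 - t <= 0.2
                for t2, k, l in stream)
        for t, kind, label in stream
    )

print(hypothetical_rule(aggregate(ecg_events, abp_events)))   # True on this toy stream
```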

    UniRank: Unimodal Bandit Algorithm for Online Ranking

    We tackle a new emerging problem: finding an optimal monopartite matching in a weighted graph. The semi-bandit version, where a full matching is sampled at each iteration, has been addressed by \cite{ADMA}, yielding an algorithm with an expected regret of $O\!\left(\frac{L\log(L)}{\Delta}\log(T)\right)$ with $2L$ players, $T$ iterations and a minimum reward gap $\Delta$. We reduce this bound in two steps. First, as in \cite{GRAB} and \cite{UniRank}, we use the unimodality property of the expected reward on the appropriate graph to design an algorithm with a regret of $O\!\left(\frac{L}{\Delta}\log(T)\right)$. Second, we show that by moving the focus towards the main question `\emph{Is user $i$ better than user $j$?}' this regret becomes $O\!\left(\frac{L\Delta}{\tilde{\Delta}^2}\log(T)\right)$, where $\tilde{\Delta} > \Delta$ derives from a better way of comparing users. Experimental results finally show that these theoretical bounds are corroborated in practice.
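
    To make the improvement concrete, the snippet below plugs illustrative values (not from the paper) into the three bounds, ignoring the constants hidden by the O(.) notation.

```python
# Compare the three regret bounds for example values; constants hidden by the
# O(.) notation are ignored, so only the relative ordering is meaningful.
from math import log

L, delta, delta_tilde, T = 10, 0.05, 0.2, 10**6   # illustrative values, delta_tilde > delta

prior_bound    = L * log(L) / delta * log(T)            # O(L log(L)/Delta * log T)
unimodal_bound = L / delta * log(T)                     # O(L/Delta * log T)
pairwise_bound = L * delta / delta_tilde**2 * log(T)    # O(L Delta/Delta_tilde^2 * log T)

print(f"{prior_bound:.0f} > {unimodal_bound:.0f} > {pairwise_bound:.0f}")
# Since delta_tilde > delta, Delta/Delta_tilde^2 < 1/Delta, so the last bound is the smallest.
```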

    CAWET: Context-Aware Worst-Case Execution Time Estimation Using Transformers

    This paper presents CAWET, a hybrid worst-case program timing estimation technique. CAWET identifies the longest execution path using static techniques, whereas the worst-case execution time (WCET) of basic blocks is predicted using an advanced language processing technique called Transformer-XL. By employing Transformer-XL in CAWET, the execution context formed by previously executed basic blocks is taken into account, allowing the micro-architecture of the processor pipeline to be considered without explicit modeling. Through a series of experiments on the TacleBench benchmarks, using different target processors (Arm Cortex M4, M7, and A53), our method is demonstrated to never underestimate WCETs and is shown to be less pessimistic than its competitors.
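
    The sketch below illustrates the hybrid idea under strong simplifications: a stand-in function replaces the Transformer-XL predictor, the control-flow graph is acyclic, and loop bounds are ignored; none of the names come from the CAWET implementation.

```python
# Hybrid WCET sketch: context-aware per-block timing predictions combined by a
# static longest-path search over a toy acyclic control-flow graph (CFG).
# `predict_block_wcet` is a stand-in for the Transformer-XL model.

cfg = {                      # hypothetical CFG: basic block -> successor blocks
    "entry": ["b1", "b2"],
    "b1": ["exit"],
    "b2": ["b3"],
    "b3": ["exit"],
    "exit": [],
}

def predict_block_wcet(block, context):
    """Return a cycle bound for `block` given the previously executed blocks."""
    base = {"entry": 5, "b1": 40, "b2": 25, "b3": 30, "exit": 2}[block]
    # Toy context effect: a block reached with an empty pipeline costs more.
    return base + 10 if not context else base

def wcet(block, context=()):
    here = predict_block_wcet(block, context)
    succs = cfg[block]
    if not succs:
        return here
    # Static part: take the most expensive successor path (longest path).
    return here + max(wcet(s, context + (block,)) for s in succs)

print(wcet("entry"))   # program-level bound along the worst path of the toy CFG
```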

    Recherche efficace de motifs fréquents dans des grilles

    General-purpose exhaustive graph mining algorithms are seldom used in real-life contexts due to the high complexity of the process, which is mostly based on costly isomorphism tests and countless expansion possibilities. In this paper, we show how to exploit grid-based representations to efficiently extract frequent grid subgraphs, and we introduce an efficient grid mining algorithm called GRIMA designed to scale to large amounts of data. We apply our algorithm to image classification problems, for which we propose a bag-of-grids representation. Experiments show that our algorithm is efficient and that adding structure may help the image classification process.
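
    The toy snippet below only hints at why the grid setting helps: candidate patterns are fixed-size windows of labels, so counting their support needs no general subgraph-isomorphism test; GRIMA itself enumerates more general grid patterns.

```python
# Toy frequent-pattern search on grids: count how many grids contain each
# k x k window of labels and keep the frequent ones. This is a simplification
# of what a grid miner such as GRIMA does.
from collections import Counter

def windows(grid, k):
    h, w = len(grid), len(grid[0])
    return [tuple(tuple(grid[i + a][j + b] for b in range(k)) for a in range(k))
            for i in range(h - k + 1) for j in range(w - k + 1)]

def frequent_windows(grids, k, min_support):
    support = Counter(p for g in grids for p in set(windows(g, k)))
    return {p: c for p, c in support.items() if c >= min_support}
```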

    GriMa: a Grid Mining Algorithm for Bag-of-Grid-Based Classification

    General-purpose exhaustive graph mining algorithms have seldom been used in real-life contexts due to the high complexity of the process, which is mostly based on costly isomorphism tests and countless expansion possibilities. In this paper, we explain how to exploit grid-based representations of problems to efficiently extract frequent grid subgraphs and create Bags-of-Grids, which can be used as new features for classification purposes. We provide an efficient grid mining algorithm called GriMA, which is designed to scale to large amounts of data. We apply our algorithm to image classification problems where typical Bag-of-Visual-Words-based techniques are used. However, those techniques make use of limited spatial information in the image, which could be beneficial for obtaining more discriminative features. Experiments on different datasets show that our algorithm is efficient and that adding the structure may greatly help the image classification process.
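
    Following the same simplification as above (fixed k x k windows instead of general grid subgraphs), a bag-of-grids feature vector for an image could be built as sketched below; the helper names are illustrative.

```python
# Bag-of-grids sketch: an image encoded as a grid of visual-word labels becomes
# a histogram of mined grid patterns, usable as a feature vector for any
# standard classifier. Window-shaped patterns are a simplifying assumption.
from collections import Counter

def windows(grid, k):
    h, w = len(grid), len(grid[0])
    return [tuple(tuple(grid[i + a][j + b] for b in range(k)) for a in range(k))
            for i in range(h - k + 1) for j in range(w - k + 1)]

def bag_of_grids(word_grid, mined_patterns, k=2):
    occ = Counter(windows(word_grid, k))
    return [occ[p] for p in mined_patterns]
```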